Skip to content

Add CollectionIndexOnNonIndexedSeq rule to Scapegoat#158

Closed
JoshRosen wants to merge 1 commit intoscapegoat-scala:masterfrom
JoshRosen:index-on-non-indexed-seq
Closed

Add CollectionIndexOnNonIndexedSeq rule to Scapegoat#158
JoshRosen wants to merge 1 commit intoscapegoat-scala:masterfrom
JoshRosen:index-on-non-indexed-seq

Conversation

@JoshRosen
Copy link
Copy Markdown
Contributor

Inspired by https://twitter.com/jshrsn/status/865677863646658560, this patch adds a new IndexingOnANonIndexedSeq rule which raises a warning when we see code which indexes on a Seq which isn't known to be an IndexedSeq subtype. The goal here is to make it easier to prevent cases where someone writes an O(n^2) loop by accident.

This rule should have prevented the following Spark bugs:

For comparision, here's the same rule implemented in linter: JoshRosen/linter#1

@JoshRosen JoshRosen closed this May 20, 2017
@JoshRosen JoshRosen deleted the index-on-non-indexed-seq branch May 20, 2017 21:54
@JoshRosen JoshRosen restored the index-on-non-indexed-seq branch May 20, 2017 21:54
@JoshRosen
Copy link
Copy Markdown
Contributor Author

(Sorry, meant to open this against my own fork instead)

@sksamuel
Copy link
Copy Markdown
Collaborator

I could add it anyway if you don't mind ?

@JoshRosen
Copy link
Copy Markdown
Contributor Author

JoshRosen commented May 20, 2017

@sksamuel, my original thought was that this inspection would be too noisy for folks who aren't writing performance-intensive code. Even in Spark itself, this rule flags many cases which aren't serious performance problems due to small collections sizes, small number of calls, etc.

I could see this inspection being pretty annoying for folks who aren't writing performance-sensitive imperative code in Scala. That said, we already have the opposite experience with the usage of var rule: the var rule flags tons of undesired warnings in our codebase but is probably useful with more functional code. I'm happy to re-open if this perhaps-niche rule is welcome.

@JoshRosen
Copy link
Copy Markdown
Contributor Author

If you do want to accept this rule, I'll re-open and add it to the README.


override def inspect(tree: Tree): Unit = {
tree match {
case Apply(Select(lhs, TermName("apply")), List(_)) if isSeq(lhs) && !isIndexedSeq(lhs) =>
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably worth ignoring the apply(0) / apply(constant) case (like I did in my linter implementation.

@JoshRosen
Copy link
Copy Markdown
Contributor Author

I totally messed this up by force-pushing to a closed PR branch, so let me re-open. Sorry for the noise :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants